A Greedy Community-Mining Algorithm Based on Clustering Coefficient

نویسنده

  • Aurel Cami
چکیده

A community in a large, real-life network, such as the World Wide Web (Web), has been broadly defined as a group of nodes that are densely linked with each other, while being sparsely linked with the rest of the nodes. In the last 2-3 years, community-mining in such networks has emerged as a problem of great practical significance. This problem has been framed in at least two different versions: (1) Partitioning into communities refers to partitioning a given network into subsets of nodes, each forming a community; and (2) Seed-growth refers to finding the community to which a given set of “seed” nodes belongs. While several algorithms—employing techniques ranging from hierarchical clustering to spectral partitioning and network flows—have been put forward for the former version, relatively little attention has been devoted to the latter. In this paper we propose a greedy, best-first algorithm for the seed-growth version of community-mining. Beginning with a set of seed nodes, this algorithm grows a community by repeatedly selecting some nodes from community’s neighborhood and placing them in the community. At each step, the algorithm uses clustering coefficient to decide which nodes from the neighborhood are to be included in the community. Our experimental results on both computergenerated and real-world networks indicate that this method may be effective in mining the communities of large networks.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An Optimization K-Modes Clustering Algorithm with Elephant Herding Optimization Algorithm for Crime Clustering

The detection and prevention of crime, in the past few decades, required several years of research and analysis. However, today, thanks to smart systems based on data mining techniques, it is possible to detect and prevent crime in a considerably less time. Classification and clustering-based smart techniques can classify and cluster the crime-related samples. The most important factor in the c...

متن کامل

Persistent K-Means: Stable Data Clustering Algorithm Based on K-Means Algorithm

Identifying clusters or clustering is an important aspect of data analysis. It is the task of grouping a set of objects in such a way those objects in the same group/cluster are more similar in some sense or another. It is a main task of exploratory data mining, and a common technique for statistical data analysis This paper proposed an improved version of K-Means algorithm, namely Persistent K...

متن کامل

Sampling from social networks’s graph based on topological properties and bee colony algorithm

In recent years, the sampling problem in massive graphs of social networks has attracted much attention for fast analyzing a small and good sample instead of a huge network. Many algorithms have been proposed for sampling of social network’ graph. The purpose of these algorithms is to create a sample that is approximately similar to the original network’s graph in terms of properties such as de...

متن کامل

CUSTOMER CLUSTERING BASED ON FACTORS OF CUSTOMER LIFETIME VALUE WITH DATA MINING TECHNIQUE

Organizations have used Customer Lifetime Value (CLV) as an appropriate pattern to classify their customers. Data mining techniques have enabled organizations to analyze their customers’ behaviors more quantitatively. This research has been carried out to cluster customers based on factors of CLV model including length, recency, frequency, and monetary (LRFM) through data mining. Based on LRFM,...

متن کامل

Clustering and Ranking University Majors using Data Mining and AHP algorithms: The case of Iran

Abstract: Although all university majors are prominent and the necessity of their presences is of no question, they might not have the same priority basis considering different resources and strategies that could be spotted for a country. This paper focuses on clustering and ranking university majors in Iran. To do so, a model is presented to clarify the procedure. Eight different criteria are ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005